Backpropagation — Example 3 (overdetermined batch)

3 inputs, 1 output, no hidden layers, no activation; batch of 6 training samples; MSE loss.

There are 6 training samples but only 4 parameters (w₁, w₂, w₃, b), so the system is overdetermined. In general, no choice of the 4 parameters can fit all 6 targets exactly, so the minimum achievable MSE is greater than zero. Watch the loss level off at that nonzero value as gradient descent runs.
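The plateau can be reproduced offline with a minimal sketch like the one below. The data, the learning rate, and the helper names (`mse_and_grads`, `train`) are assumptions made for illustration; they are not the values or code used by the demo. With residuals e = Xw + b − y, the gradients of the MSE are ∂L/∂w = (2/N)·Xᵀe and ∂L/∂b = (2/N)·Σe. The sketch compares the final gradient-descent loss with the closed-form least-squares minimum.

```python
import numpy as np

# Hypothetical data (not from the demo): 6 samples, 3 inputs each.
X = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [1, 1, 0],
              [0, 1, 1],
              [1, 0, 1]], dtype=float)
y = np.array([1, 2, 3, 2, 4, 5], dtype=float)  # chosen so no exact fit exists


def mse_and_grads(w, b, X, y):
    """Return the MSE loss and its gradients with respect to w and b."""
    err = X @ w + b - y                  # residuals, shape (6,)
    loss = np.mean(err ** 2)
    grad_w = 2.0 * X.T @ err / len(y)    # dL/dw
    grad_b = 2.0 * np.mean(err)          # dL/db
    return loss, grad_w, grad_b


def train(X, y, eta=0.1, steps=5000):
    """Plain batch gradient descent starting from w = 0, b = 0."""
    w = np.zeros(X.shape[1])
    b = 0.0
    losses = []
    for _ in range(steps):
        loss, grad_w, grad_b = mse_and_grads(w, b, X, y)
        losses.append(loss)
        w -= eta * grad_w
        b -= eta * grad_b
    return w, b, losses


w, b, losses = train(X, y)

# Closed-form least-squares solution: the lowest MSE any (w, b) can reach.
A = np.hstack([X, np.ones((len(y), 1))])
theta, *_ = np.linalg.lstsq(A, y, rcond=None)
floor = np.mean((A @ theta - y) ** 2)

print(f"initial loss: {losses[0]:.6f}")
print(f"final loss:   {losses[-1]:.6f}")
print(f"lstsq floor:  {floor:.6f}")
```

The final loss and the least-squares floor should agree to many decimal places, and both are strictly positive: gradient descent has converged, but the 4 parameters cannot absorb all 6 constraints.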

Network

Controls

w₁

w₂

w₃

η (learning rate)

Iteration: 0

Loss (MSE): 0.000000

Loss vs. iteration